Method and apparatus of current picture referencing for video coding using affine motion compensation

ABSTRACT

A method and apparatus for a video coding system with the current picture referencing (CPR) mode enabled are disclosed. According to one method, if one reference picture index for the current block points to the current image, the affine motion compensation is inferred as Off for the current block without a need for signalling or parsing an affine mode syntax or the adaptive MV resolution is inferred as On for the current block without a need for signalling or parsing an adaptive motion vector resolution syntax. In another method, if the affine mode is used for the current block or if the adaptive MV resolution is not used for the current block, a reference picture index for the current block is signalled or parsed and the reference picture index always points to one reference picture other than the current image.

CROSS REFERENCE TO RELATED APPLICATIONS

The present invention claims priority to U.S. Provisional Patent Application, Ser. No. 62/342,883 filed on May 28, 2016. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to block partition for coding and/or prediction process in video coding. In particular, the present invention discloses various coding arrangements for a coding system using current picture referencing (CPR).

BACKGROUND AND RELATED ART

The High Efficiency Video Coding (HEVC) standard was developed under the joint video project of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations, in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC). In HEVC, one slice is partitioned into multiple coding tree units (CTU). In main profile, the minimum and the maximum sizes of CTU are specified by the syntax elements in the sequence parameter set (SPS). The allowed CTU size can be 16×16, 32×32, or 64×64. For each slice, the CTUs within the slice are processed according to a raster scan order.

The CTU is further partitioned into multiple coding units (CU) to adapt to various local characteristics. A quadtree, denoted as the coding tree, is used to partition the CTU into multiple CUs. Let CTU size be M×M, where M is one of the values of 64, 32, or 16. The CTU can be a single CU (i.e., no splitting) or can be split into four smaller units of equal sizes (i.e., M/2×M/2 each), which correspond to the nodes of the coding tree. If units are leaf nodes of the coding tree, the units become CUs. Otherwise, the quadtree splitting process can be iterated until the size for a node reaches a minimum allowed CU size as specified in the SPS (Sequence Parameter Set). This representation results in a recursive structure as specified by a coding tree (also referred to as a partition tree structure) 120 in FIG. 1. The CTU partition 110 is shown in FIG. 1, where the solid lines indicate CU boundaries. The decision whether to code a picture area using Inter-picture (temporal) or Intra-picture (spatial) prediction is made at the CU level. Since the minimum CU size can be 8×8, the minimum granularity for switching between different basic prediction types is 8×8.
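The recursive quadtree splitting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the HEVC reference implementation: the function name `quadtree_partition` and the `should_split` callback (standing in for the encoder's split decision or the decoder's parsed split flags) are hypothetical.

```python
def quadtree_partition(x, y, size, min_cu_size, should_split):
    """Return the leaf CUs, as (x, y, size) tuples, of a coding tree rooted at (x, y).

    should_split(x, y, size) is a hypothetical callback that plays the role of
    the per-node split decision; it is not an HEVC syntax element name.
    """
    # A node becomes a CU when it reaches the minimum allowed CU size
    # or when the split decision says "no further split".
    if size <= min_cu_size or not should_split(x, y, size):
        return [(x, y, size)]

    half = size // 2
    leaves = []
    # Split into four equal M/2 x M/2 units and recurse into each one.
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_partition(x + dx, y + dy, half, min_cu_size, should_split)
    return leaves


if __name__ == "__main__":
    # Keep splitting only the top-left node of a 64x64 CTU, down to 8x8.
    for cu in quadtree_partition(0, 0, 64, 8, lambda x, y, s: x == 0 and y == 0):
        print(cu)
```

Whatever split decisions are made, the leaf CUs always tile the CTU exactly, which is why a decoder can recover the partition from the split flags alone.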

Furthermore, according to HEVC, each CU can be partitioned into one or more prediction units (PU). Coupled with the CU, the PU works as a basic representative block for sharing the prediction information. Inside each PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis. A CU can be split into one, two or four PUs according to the PU splitting type. HEVC defines eight shapes for splitting a CU into PU as shown in FIG. 2, including 2N×2N, 2N×N, N×2N, N×N, 2N×nU, 2N×nD, nL×2N and nR×2N partition types. Unlike the CU, the PU may only be split once according to HEVC. The partitions shown in the second row correspond to asymmetric partitions, where the two partitioned parts have different sizes.

After obtaining the residual block by the prediction process based on PU splitting type, the prediction residues of a CU can be partitioned into transform units (TU) according to another quadtree structure which is analogous to the coding tree for the CU as shown in FIG. 1. The solid lines indicate CU boundaries and dotted lines indicate TU boundaries. The TU is a basic representative block having residual or transform coefficients for applying the integer transform and quantization. For each TU, one integer transform having the same size as the TU is applied to obtain residual coefficients. These coefficients are transmitted to the decoder after quantization on a TU basis.

The terms coding tree block (CTB), coding block (CB), prediction block (PB), and transform block (TB) are defined to specify the 2-D sample array of one colour component associated with CTU, CU, PU, and TU, respectively. Thus, a CTU consists of one luma CTB, two chroma CTBs, and associated syntax elements. A similar relationship is valid for CU, PU, and TU. The tree partitioning is generally applied simultaneously to both luma and chroma, although exceptions apply when certain minimum sizes are reached for chroma.

Alternatively, a binary tree block partitioning structure is proposed in JCTVC-P1005 (D. Flynn, et al., “HEVC Range Extensions Draft 6”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 16th Meeting: San Jose, US, 9-17 Jan. 2014, Document: JCTVC-P1005). In the proposed binary tree partitioning structure, a block can be recursively split into two smaller blocks using various binary splitting types as shown in FIG. 3. The most efficient and simplest ones are the symmetric horizontal and vertical split as shown in the top two splitting types in FIG. 3. For a given block of size M×N, a flag is signalled to indicate whether the given block is split into two smaller blocks. If yes, another syntax element is signalled to indicate which splitting type is used. If the horizontal splitting is used, the given block is split into two blocks of size M×N/2. If the vertical splitting is used, the given block is split into two blocks of size M/2×N. The binary tree splitting process can be iterated until the size (width or height) for a splitting block reaches a minimum allowed block size (width or height). The minimum allowed block size can be defined in high level syntax such as SPS. Since the binary tree has two splitting types (i.e., horizontal and vertical), the minimum allowed block width and height should be both indicated. Non-horizontal splitting is implicitly implied when splitting would result in a block height smaller than the indicated minimum. Non-vertical splitting is implicitly implied when splitting would result in a block width smaller than the indicated minimum. FIG. 4 illustrates an example of block partitioning 410 and its corresponding binary tree 420. In each splitting node (i.e., non-leaf node) of the binary tree, one flag is used to indicate which splitting type (horizontal or vertical) is used, where 0 may indicate horizontal splitting and 1 may indicate vertical splitting.
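As a rough sketch of the rules in the preceding paragraph, the following Python fragment lists which binary split types remain available for an M×N block and computes the resulting sub-block sizes. The function names are hypothetical, and the sketch follows the convention stated above that horizontal splitting halves the height (M×N/2) while vertical splitting halves the width (M/2×N).

```python
def allowed_binary_splits(width, height, min_width, min_height):
    """List the binary split types still available for a width x height block."""
    splits = []
    # Horizontal splitting yields two width x height/2 blocks; it is ruled out
    # when the resulting height would fall below the indicated minimum.
    if height // 2 >= min_height:
        splits.append("horizontal")
    # Vertical splitting yields two width/2 x height blocks; it is ruled out
    # when the resulting width would fall below the indicated minimum.
    if width // 2 >= min_width:
        splits.append("vertical")
    return splits


def binary_split(width, height, split_type):
    """Return the (width, height) of the two sub-blocks for the given split type."""
    if split_type == "horizontal":
        return [(width, height // 2)] * 2
    if split_type == "vertical":
        return [(width // 2, height)] * 2
    raise ValueError("split_type must be 'horizontal' or 'vertical'")


if __name__ == "__main__":
    print(allowed_binary_splits(8, 4, 4, 4))
    print(binary_split(16, 8, "horizontal"))
```

When only one split type remains available, the splitting-type flag carries no information and need not be signalled, which is the sense in which the other type is implied.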

The binary tree structure can be used for partitioning an image area into multiple smaller blocks such as partitioning a slice into CTUs, a CTU into CUs, a CU into PUs, or a CU into TUs, and so on. The binary tree can be used for partitioning a CTU into CUs, where the root node of the binary tree is a CTU and the leaf node of the binary tree is CU. The leaf nodes can be further processed by prediction and transform coding. For simplification, there is no further partitioning from CU to PU or from CU to TU, which means CU equal to PU and PU equal to TU. In other words, the leaf node of the binary tree is the basic unit for prediction and transform coding.

Binary tree structure is more flexible than quadtree structure since more partition shapes can be supported, which is also the source of coding efficiency improvement. However, the encoding complexity will also increase in order to select the best partition shape. In order to balance the complexity and coding efficiency, a method to combine the quadtree and binary tree structure, also called the quadtree plus binary tree (QTBT) structure, has been disclosed. According to the QTBT structure, a block is firstly partitioned by a quadtree structure and the quadtree splitting can be iterated until the size for a splitting block reaches the minimum allowed quadtree leaf node size. If the leaf quadtree block is not larger than the maximum allowed binary tree root node size, it can be further partitioned by a binary tree structure and the binary tree splitting can be iterated until the size (width or height) for a splitting block reaches the minimum allowed binary tree leaf node size (width or height) or the binary tree depth reaches the maximum allowed binary tree depth. In the QTBT structure, the minimum allowed quadtree leaf node size, the maximum allowed binary tree root node size, the minimum allowed binary tree leaf node width and height, and the maximum allowed binary tree depth can be indicated in the high level syntax such as in SPS. FIG. 5 illustrates an example of block partitioning 510 and its corresponding QTBT 520. The solid lines indicate quadtree splitting and dotted lines indicate binary tree splitting. In each splitting node (i.e., non-leaf node) of the binary tree, one flag indicates which splitting type (horizontal or vertical) is used, where 0 may indicate horizontal splitting and 1 may indicate vertical splitting.

The above QTBT structure can be used for partitioning an image area (e.g. a slice, CTU or CU) into multiple smaller blocks such as partitioning a slice into CTUs, a CTU into CUs, a CU into PUs, or a CU into TUs, and so on. For example, the QTBT can be used for partitioning a CTU into CUs, where the root node of the QTBT is a CTU which is partitioned into multiple CUs by a QTBT structure and the CUs are further processed by prediction and transform coding. For simplification, there is no further partitioning from CU to PU or from CU to TU. That means CU equal to PU and PU equal to TU. In other words, the leaf node of the QTBT structure is the basic unit for prediction and transform.

An example of QTBT structure is shown as follows. For a CTU with size 128×128, the minimum allowed quadtree leaf node size is set to 16×16, the maximum allowed binary tree root node size is set to 64×64, the minimum allowed binary tree leaf node width and height are both set to 4, and the maximum allowed binary tree depth is set to 4. Firstly, the CTU is partitioned by a quadtree structure and the leaf quadtree unit may have size from 16×16 (i.e., minimum allowed quadtree leaf node size) to 128×128 (equal to CTU size, i.e., no split). If the leaf quadtree unit is 128×128, it cannot be further split by binary tree since the size exceeds the maximum allowed binary tree root node size 64×64. Otherwise, the leaf quadtree unit can be further split by binary tree. The leaf quadtree unit, which is also the root binary tree unit, has binary tree depth as 0. When the binary tree depth reaches 4 (i.e., the maximum allowed binary tree depth as indicated), no splitting is implicitly implied. When the block of a corresponding binary tree node has width equal to 4, non-horizontal splitting is implicitly implied. When the block of a corresponding binary tree node has height equal to 4, non-vertical splitting is implicitly implied. The leaf nodes of the QTBT are further processed by prediction (Intra picture or Inter picture) and transform coding.
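The size and depth limits of this example can be captured in a small sketch. The class and function names below are hypothetical, the defaults are the values quoted in the example, and the minimum binary tree leaf width and height checks are deliberately left out so that only the quadtree leaf range, the binary tree root size limit and the depth limit are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtbtParams:
    """Hypothetical container for the high-level QTBT limits of the example."""
    ctu_size: int = 128
    min_qt_leaf_size: int = 16
    max_bt_root_size: int = 64
    max_bt_depth: int = 4


def quadtree_leaf_sizes(p):
    """All sizes a leaf quadtree unit may take, from the CTU size down to the minimum."""
    sizes = []
    size = p.ctu_size
    while size >= p.min_qt_leaf_size:
        sizes.append(size)
        size //= 2
    return sizes


def binary_tree_split_possible(p, qt_leaf_size, bt_depth):
    """True if a binary tree node under this quadtree leaf may still be split.

    Only the root size limit and the depth limit are checked here.
    """
    if qt_leaf_size > p.max_bt_root_size:
        return False  # e.g. a 128x128 leaf exceeds the 64x64 root limit
    return bt_depth < p.max_bt_depth


if __name__ == "__main__":
    params = QtbtParams()
    print(quadtree_leaf_sizes(params))
    print(binary_tree_split_possible(params, 128, 0))
    print(binary_tree_split_possible(params, 64, 0))
```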

For I-slice, the QTBT tree structure is usually applied with the luma/chroma separate coding. For example, the QTBT tree structure is applied separately to luma and chroma components for I-slice, and applied simultaneously to both luma and chroma (except when certain minimum sizes are reached for chroma) for P- and B-slices. In other words, in an I-slice, the luma CTB has its QTBT-structured block partitioning and the two chroma CTBs have another QTBT-structured block partitioning. In another example, the two chroma CTBs can also have their own QTBT-structured block partitions.

For block-based coding, there is always a need to partition an image into blocks (e.g. CUs, PUs and TUs) for the coding purpose. As known in the field, the image may be divided into smaller image areas, such as slices, tiles, CTU rows or CTUs before applying the block partition. The process to partition an image into blocks for the coding purpose is referred to as partitioning the image using a coding unit (CU) structure. The particular partition method to generate CUs, PUs and TUs as adopted by HEVC is an example of the coding unit (CU) structure. The QTBT tree structure is another example of the coding unit (CU) structure.

Current Picture Referencing

Motion estimation/compensation is a well-known key technology in hybrid video coding, which explores the pixel correlation between adjacent pictures. In a video sequence, the object movement between neighbouring frames is small and the object movement can be modelled by two-dimensional translational motion. Accordingly, the patterns corresponding to objects or background in a frame are displaced to form corresponding objects in the subsequent frame or correlated with other patterns within the current frame. With the estimation of a displacement (e.g. using block matching techniques), the pattern can be mostly reproduced without the need to re-code the pattern. Similarly, block matching and copy has also been tried to allow selecting the reference block from within the same picture. It was observed to be not efficient when applying this concept to videos captured by a camera. Part of the reason is that the textural pattern in a spatial neighbouring area may be similar to the current coding block, but usually with some gradual changes over space. It is thus difficult for a block to find an exact match within the same picture of video captured by a camera. Therefore, the improvement in coding performance is limited.

However, the spatial correlation among pixels within the same picture is different for screen content. For typical video with text and graphics, there are usually repetitive patterns within the same picture. Hence, Intra (picture) block compensation has been observed to be very effective. A new prediction mode, i.e., the Intra block copy (IBC) mode, also called current picture referencing (CPR), has been introduced for screen content coding to utilize this characteristic. In the CPR mode, a prediction unit (PU) is predicted from a previously reconstructed block within the same picture. Further, a displacement vector (called block vector or BV) is used to signal the relative displacement from the position of the current block to the position of the reference block. The prediction errors are then coded using transformation, quantization and entropy coding. An example of CPR compensation is illustrated in FIG. 6, where area 610 corresponds to a picture, a slice or a picture area to be coded. Blocks 620 and 630 correspond to two blocks to be coded. In this example, each block can find a corresponding block in the previous coded area in the current picture (i.e., 622 and 632 respectively). According to this technique, the reference samples correspond to the reconstructed samples of the current decoded picture prior to in-loop filter operations including both deblocking and sample adaptive offset (SAO) filters in HEVC.
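The copy operation behind the CPR mode can be illustrated with a toy sketch. It assumes a picture stored as a list of rows of integer samples and integer-pel block vectors; the function names are hypothetical, and the availability rules for the reference area (what has already been reconstructed) are not modelled here.

```python
def cpr_predict(recon, x, y, width, height, bv):
    """Fetch the CPR prediction block for the block at (x, y) from the same picture.

    recon is a list of sample rows; bv = (bv_x, bv_y) is an integer block vector
    pointing from the current block to the reference block.
    """
    ref_x, ref_y = x + bv[0], y + bv[1]
    if ref_x < 0 or ref_y < 0 or ref_y + height > len(recon) or ref_x + width > len(recon[0]):
        raise ValueError("reference block lies outside the picture")
    return [row[ref_x:ref_x + width] for row in recon[ref_y:ref_y + height]]


def residual(block, prediction):
    """Sample-wise prediction error that would go on to transform and quantization."""
    return [[b - p for b, p in zip(brow, prow)] for brow, prow in zip(block, prediction)]


if __name__ == "__main__":
    # A repetitive 8x4 "screen content" pattern: it repeats every 4 columns and 2 rows.
    picture = [[(r % 2) * 10 + (c % 4) for c in range(8)] for r in range(4)]
    current = [row[4:8] for row in picture[2:4]]
    prediction = cpr_predict(picture, 4, 2, 4, 2, (-4, -2))
    print(residual(current, prediction))
```

For a perfectly repetitive pattern the residual is all zeros, which is the situation that makes the mode attractive for text and graphics.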

An early version of CPR was disclosed in JCTVC-M0350 (Madhukar Budagavi, et al., “AHG8: Video coding using Intra motion compensation”, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Incheon, KR, 18-26 Apr. 2013, Document: JCTVC-M0350), which was submitted as a candidate technology for HEVC Range Extensions (RExt) development. In JCTVC-M0350, the CPR compensation was limited to be within a small local area and the search was limited to 1-D block vector for the block size of 2N×2N only. Later, a more advanced CPR method was developed during the standardization of HEVC SCC (screen content coding).

In order to signal the block vector (BV) efficiently, the BV is signalled predictively using a BV predictor (BVP) in a similar fashion as the MV coding. Accordingly, the BV difference (BVD) is signalled and the BV can be reconstructed according to BV=BVP+BVD as shown in FIG. 7, where reference block 720 is selected as IntraBC prediction for the current block 710 (i.e., a CU). A BVP is determined for the current CU. Methods to derive the motion vector predictor (MVP) are known in the field. Similar derivation can be applied to BVP derivation.
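The relation BV=BVP+BVD amounts to a component-wise difference at the encoder and a component-wise sum at the decoder. A minimal sketch follows; the function names are hypothetical and the BVP derivation itself is taken as given.

```python
def encode_bvd(bv, bvp):
    """Encoder side: the signalled difference BVD = BV - BVP, per component."""
    return (bv[0] - bvp[0], bv[1] - bvp[1])


def reconstruct_bv(bvp, bvd):
    """Decoder side: BV = BVP + BVD, per component."""
    return (bvp[0] + bvd[0], bvp[1] + bvd[1])


if __name__ == "__main__":
    bv, bvp = (-24, -8), (-16, -8)
    bvd = encode_bvd(bv, bvp)
    print(bvd, reconstruct_bv(bvp, bvd))
```

A good predictor keeps the BVD small, so fewer bits are spent than when the BV is coded directly.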

When CPR is used, only part of the current picture can be used as the reference picture. Some bitstream conformance constraints are imposed to regulate the valid MV value referring to the current picture.

First, one of the following two equations must be true:

BV_x+offsetX+nPbSw+xPbs−xCbs<=0, and  (1)

BV_y+offsetY+nPbSh+yPbs−yCbs<=0.  (2)

Second, the following WPP (Wavefront Parallel Processing) condition must be true:

(xPbs+BV_x+offsetX+nPbSw−1)/CtbSizeY−xCbs/CtbSizeY<=yCbs/CtbSizeY−(yPbs+BV_y+offsetY+nPbSh−1)/CtbSizeY  (3)

In equations (1) through (3), (BV_x, BV_y) is the luma block vector (i.e., the motion vector for CPR) for the current PU; nPbSw and nPbSh are the width and height of the current PU; (xPbs, yPbs) is the location of the top-left pixel of the current PU relative to the current picture; (xCbs, yCbs) is the location of the top-left pixel of the current CU relative to the current picture; and CtbSizeY is the size of the CTU. The terms offsetX and offsetY are two adjusted offsets in two dimensions in consideration of chroma sample interpolation for the CPR mode:

offsetX=BVC_x&0x7?2:0,  (4)

offsetY=BVC_y&0x7?2:0.  (5)

(BVC_x, BVC_y) is the chroma block vector, in ⅛-pel resolution in HEVC.

Third, the reference block for CPR must be within the same tile/slice boundary.
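The first two conformance constraints can be checked mechanically. The sketch below is illustrative only and rests on several assumptions: the first constraint is read as requiring at least one of (1) and (2) to hold, the divisions in (3) are treated as integer divisions on non-negative sample positions, the chroma block vector is supplied by the caller rather than derived, and the third (tile/slice) constraint is not modelled. The function names are hypothetical.

```python
def chroma_offset(bvc_component):
    """Equations (4) and (5): 2 if the 1/8-pel chroma BV component has a
    fractional part (low three bits non-zero), otherwise 0."""
    return 2 if (bvc_component & 0x7) else 0


def cpr_bv_is_valid(bv, bvc, pu_pos, pu_size, cu_pos, ctb_size):
    """Check constraints (1)-(3) for a luma block vector bv = (BV_x, BV_y).

    bvc is the chroma block vector (BVC_x, BVC_y); pu_pos = (xPbs, yPbs);
    pu_size = (nPbSw, nPbSh); cu_pos = (xCbs, yCbs); ctb_size = CtbSizeY.
    """
    bv_x, bv_y = bv
    x_pbs, y_pbs = pu_pos
    n_pb_sw, n_pb_sh = pu_size
    x_cbs, y_cbs = cu_pos
    offset_x = chroma_offset(bvc[0])
    offset_y = chroma_offset(bvc[1])

    # (1): the reference block ends to the left of the current CU.
    left_of_cu = bv_x + offset_x + n_pb_sw + x_pbs - x_cbs <= 0
    # (2): the reference block ends above the current CU.
    above_cu = bv_y + offset_y + n_pb_sh + y_pbs - y_cbs <= 0
    if not (left_of_cu or above_cu):  # assumption: at least one must hold
        return False

    # (3): WPP condition, expressed in units of CTUs.
    lhs = (x_pbs + bv_x + offset_x + n_pb_sw - 1) // ctb_size - x_cbs // ctb_size
    rhs = y_cbs // ctb_size - (y_pbs + bv_y + offset_y + n_pb_sh - 1) // ctb_size
    return lhs <= rhs


if __name__ == "__main__":
    # An 8x8 PU at the top-left corner of the CU located at (64, 64), CTU size 64.
    args = ((64, 64), (8, 8), (64, 64), 64)
    print(cpr_bv_is_valid((-16, 0), (0, 0), *args))
    print(cpr_bv_is_valid((0, 0), (0, 0), *args))
```

Under these assumptions, condition (3) lets a block refer one CTU further to the right for every CTU row it reaches upward, which matches the wavefront order in which CTUs become available.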

Affine Motion Compensation

The affine model can be used to describe 2D block rotations, as well as 2D deformations of squares (rectangles) into parallelograms. This model can be described as follows:

x′=a0+a1*x+a2*y,

y′=b0+b1*x+b2*y.  (6)

In this model, 6 parameters need to be determined. For each pixel (x, y) in the area of interest, the motion vector is determined as the difference between the location of the given pixel (A) and the location of its corresponding pixel in the reference block (A′), i.e., MV=A′−A=(a0+(a1−1)*x+a2*y, b0+b1*x+(b2−1)*y). Therefore, the motion vector for each pixel is location dependent.

According to this model, if the motion vectors of three different locations are known, then the above parameters can be solved. It is equivalent to the condition that the 6 parameters are known. Each location with a known motion vector is referred to as a control point. The 6-parameter affine model corresponds to a 3-control-point model.
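To make the equivalence between three control points and six parameters concrete, the following sketch solves equation (6) for (a0, a1, a2) and (b0, b1, b2) from three control-point MVs and then evaluates the per-pixel MV. It is a minimal illustration using exact fractions and Cramer's rule, assuming integer control-point coordinates and MVs; the function names are hypothetical, and practical codecs use fixed-point arithmetic instead.

```python
from fractions import Fraction


def affine_mv(params, x, y):
    """MV = A' - A = (a0 + (a1 - 1)*x + a2*y, b0 + b1*x + (b2 - 1)*y)."""
    a0, a1, a2, b0, b1, b2 = params
    return (a0 + (a1 - 1) * x + a2 * y, b0 + b1 * x + (b2 - 1) * y)


def _det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(rows, rhs):
    """Solve a 3x3 linear system with Cramer's rule (integer inputs assumed)."""
    det = _det3(rows)
    if det == 0:
        raise ValueError("control points must not be collinear")
    solution = []
    for col in range(3):
        m = [list(r) for r in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        solution.append(Fraction(_det3(m), det))
    return solution


def params_from_control_points(points, mvs):
    """Recover (a0, a1, a2, b0, b1, b2) from three control points and their MVs."""
    rows = [(1, x, y) for x, y in points]
    # Each control point maps (x, y) to (x', y') = (x + mv_x, y + mv_y).
    a = _solve3(rows, [x + mv[0] for (x, _), mv in zip(points, mvs)])
    b = _solve3(rows, [y + mv[1] for (_, y), mv in zip(points, mvs)])
    return tuple(a + b)


if __name__ == "__main__":
    corners = [(0, 0), (16, 0), (0, 16)]     # three corners of a 16x16 block
    corner_mvs = [(1, 2), (3, 2), (1, 6)]    # their (hypothetical) MVs
    params = params_from_control_points(corners, corner_mvs)
    print(affine_mv(params, 8, 8))           # MV at the block centre
```

Evaluating `affine_mv` at the control points reproduces their MVs, and every other pixel receives an MV interpolated from them, which is why the MV is location dependent.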

In the technical literature by Li, et al. (“An affine motion compensation framework for high efficiency video coding”, in 2015 IEEE International Symposium on Circuits and Systems (ISCAS), 24-27 May 2015, Pages: 525-528) and by Huang et al. (“Control-Point Representation and Differential Coding Affine-Motion Compensation”, IEEE Transactions on Circuits and Systems for Video Technology (CSVT), Vol. 23, No. 10, pages 1651-1660, October 2013), some exemplary implementations of affine motion compensation are presented. In the technical literature by Li, et al., an affine flag is signalled for the 2N×2N block partition, when the current block is coded in either Merge mode or AMVP mode. If this flag is true, the derivation of motion vectors for the current block follows the affine model. If this flag is false, the derivation of motion vectors for the current block follows the traditional translational model. Three control points (3 MVs) are signalled when affine AMVP mode is used. At each control point location, the MV is predictively coded. Later, the MVDs of these control points are coded and transmitted. In the technical literature by Huang et al., different control point locations and predictive coding of MVs in control points are studied.

A syntax table for an affine motion compensation implementation is shown in Table 1. As shown in Table 1, syntax element use_affine_flag is signalled if at least one Merge candidate is affine coded and the partition mode is 2N×2N (i.e., PartMode==PART_2Nx2N) as indicated by Notes (1-1) to (1-3) for the Merge mode. Syntax element use_affine_flag is signalled if the current block size is larger than 8×8 (i.e., log2CbSize>3) and the partition mode is 2N×2N (i.e., PartMode==PART_2Nx2N) as indicated by Notes (1-4) to (1-6) for the B slice. If use_affine_flag indicates the affine model being used (i.e., use_affine_flag having a value of 1), information of the other two control points is signalled for reference list L0 as indicated by Notes (1-7) to (1-9) and information of the other two control points is signalled for reference list L1 as indicated by Notes (1-10) to (1-12).

TABLE 1

Syntax                                                                    Note
prediction_unit( x0, y0, nPbW, nPbH ) {
  if( cu_skip_flag[ x0 ][ y0 ] ) {
    if( MaxNumMergeCand > 1 )
      merge_idx[ x0 ][ y0 ]
  } else { /* MODE_INTER */
    merge_flag[ x0 ][ y0 ]
    if( merge_flag[ x0 ][ y0 ] ) {
      if( at least one merge candidate is affine coded && PartMode = = PART_2Nx2N )    1-1
        use_affine_flag                                                   1-2
      else                                                                1-3
        if( MaxNumMergeCand > 1 )
          merge_idx[ x0 ][ y0 ]
    } else {
      if( slice_type = = B )
        inter_pred_idc[ x0 ][ y0 ]
      if( log2CbSize > 3 && PartMode = = PART_2Nx2N )                     1-4
        use_affine_flag                                                   1-5
      if( inter_pred_idc[ x0 ][ y0 ] != PRED_L1 ) {                       1-6
        if( num_ref_idx_l0_active_minus1 > 0 )
          ref_idx_l0[ x0 ][ y0 ]
        mvd_coding( x0, y0, 0 )
        if( use_affine_flag ) {                                           1-7
          mvd_coding( x0, y0, 0 ) /* second control point when affine mode is used */    1-8
          mvd_coding( x0, y0, 0 ) /* third control point when affine mode is used */     1-9
        }
        mvp_l0_flag[ x0 ][ y0 ]
      }
      if( inter_pred_idc[ x0 ][ y0 ] != PRED_L0 ) {
        if( num_ref_idx_l1_active_minus1 > 0 )
          ref_idx_l1[ x0 ][ y0 ]
        if( mvd_l1_zero_flag && inter_pred_idc[ x0 ][ y0 ] = = PRED_BI ) {
          MvdL1[ x0 ][ y0 ][ 0 ] = 0
          MvdL1[ x0 ][ y0 ][ 1 ] = 0
        } else
          mvd_coding( x0, y0, 1 )
        if( use_affine_flag ) {                                           1-10
          mvd_coding( x0, y0, 1 ) /* second control point when affine mode is used */    1-11
          mvd_coding( x0, y0, 1 ) /* third control point when affine mode is used */     1-12
        }
        mvp_l1_flag[ x0 ][ y0 ]
      }
    }
  }
}

In the present invention, various aspects of CPR coding with the QTBT structure or luma/chroma separate coding are addressed.

BRIEF SUMMARY OF THE INVENTION

A method and apparatus for video encoding and decoding used by a video encoding system and video decoding system respectively are disclosed. According to one method of the present invention, input data associated with a current block in a current image are received, where affine motion compensation or adaptive motion vector (MV) resolution is enabled for coding the current image. One or more reference picture indexes for a current block are signalled or parsed. If one reference picture index for the current block points to the current image, the affine motion compensation is inferred as Off for the current block without a need for signalling or parsing an affine mode syntax or the adaptive MV resolution is inferred as On for the current block without a need for signalling or parsing an adaptive motion vector resolution syntax.

For the above method, when the affine motion compensation is enabled for coding the current image and if the reference picture index for the current block points to one reference picture other than the current image, the affine mode syntax is signalled or parsed to determine whether the affine motion compensation is applied to the current block according to one embodiment. In another embodiment, when the adaptive MV resolution is enabled for coding the current image and if the reference picture index for the current block points to one reference picture other than the current image, the adaptive MV resolution syntax is signalled or parsed to determine whether the adaptive MV resolution is applied to the current block. If a target reference picture index for a target reference picture list points to one reference picture other than the current image and an MVD (motion vector difference) value associated with the target reference picture list is not equal to zero, the adaptive MV resolution syntax is signalled or parsed, where the target reference picture list corresponds to List 0 or List 1. If a reference picture index points to the current image or an MVD (motion vector difference) value associated with one reference picture list is equal to zero for one or both reference picture lists, the adaptive MV resolution syntax is not signalled or parsed. The reference picture index for the current block is signalled or parsed before an affine mode syntax or an adaptive MV resolution mode syntax is signalled or parsed.

According to another method, input data associated with a current block in a current image are received, where current picture referencing (CPR) mode is enabled, and wherein affine motion compensation or adaptive motion vector (MV) resolution is enabled for coding the current image. Whether affine mode is used for the current block when the affine motion compensation is enabled for the current image is determined, or whether the adaptive MV resolution is used for the current block when the adaptive MV resolution is enabled for the current image is determined. If the affine mode is used for the current block or if the adaptive MV resolution is not used for the current block, a reference picture index for the current block is signalled or parsed, where the reference picture index always points to one reference picture other than the current image.

According to the above method, a codeword for the reference picture index corresponding to the current image may be removed from a codeword table if the affine mode is used for the current block or if the adaptive MV resolution is not used for the current block. Alternatively, a conforming video bitstream may be used to cause the reference picture index to point to one reference picture other than the current image if the affine mode is used for the current block or if the adaptive MV resolution is not used for the current block. Said determining whether affine mode is used for the current block may comprise signalling or parsing an affine mode syntax before signalling or parsing the reference picture index for the current block. The reference picture index for the current block can be signalled or parsed after an affine mode syntax or an adaptive MV resolution mode syntax is signalled or parsed. If the affine mode is not used for the current block, a reference picture index for the current block can be signalled or parsed, where a codeword of the reference picture index corresponding to the current image may be included in a codeword table. Furthermore, if the adaptive MV resolution is used for the current block, a reference picture index for the current block is signalled or parsed, where a codeword of the reference picture index corresponding to the current image is included in a codeword table.
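The codeword-table variant can be pictured with a toy example. The sketch below assumes a truncated unary binarization purely for illustration (the text above does not specify the binarization), and the function names and picture labels are hypothetical. When the affine mode is used, the entry for the current image is dropped before the codewords are assigned, so the remaining indexes receive codewords from a smaller table.

```python
def truncated_unary_table(symbols):
    """Assign truncated unary codewords: i ones followed by a zero,
    with the terminating zero omitted for the last symbol."""
    last = len(symbols) - 1
    return {s: "1" * i + ("0" if i < last else "") for i, s in enumerate(symbols)}


def ref_idx_codewords(ref_list, current_picture, affine_used):
    """Codeword table for the reference picture index of one reference list.

    If the affine mode is used, the index that refers to the current picture
    is excluded, so the signalled index can never select it.
    """
    candidates = [idx for idx, pic in enumerate(ref_list)
                  if not (affine_used and pic == current_picture)]
    return truncated_unary_table(candidates)


if __name__ == "__main__":
    ref_list = ["pic_t-1", "pic_t-2", "current"]  # current image appended for CPR
    print(ref_idx_codewords(ref_list, "current", affine_used=False))
    print(ref_idx_codewords(ref_list, "current", affine_used=True))
```

The alternative mentioned above, a bitstream conformance requirement, would keep the full table and simply forbid the encoder from choosing the current-image index in that case.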

According to yet another method, input data associated with a current block in a current image are received, where current picture referencing (CPR) mode and affine motion compensation are enabled for coding the current image. An affine mode and a reference picture index for the current block are determined. If the affine mode is used for the current block, a number of motion vectors (MVs) for the current block are signalled or parsed for a reference picture list depending on whether the reference picture index for the current block points to the current image.

According to the above method, if the affine mode is used for the current block and the reference picture index for the current block points to the current image, only one MV can be signalled or parsed for the current block for the reference picture list. Each MV can be represented by one MV predictor and one MV difference, or one MV predictor index and one MV difference. If the affine mode is used for the current block and the reference picture index for the current block points to one reference image other than the current image, more than one MV can be signalled or parsed for the current block for the reference picture list.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example of block partition using quadtree structure to partition a coding tree unit (CTU) into coding units (CUs).

FIG. 2 illustrates asymmetric motion partition (AMP) according to High Efficiency Video Coding (HEVC), where the AMP defines eight shapes for splitting a CU into PU.

FIG. 3 illustrates an example of various binary splitting types used by a binary tree partitioning structure, where a block can be recursively split into two smaller blocks using the splitting types.

FIG. 4 illustrates an example of block partitioning and its corresponding binary tree, where in each splitting node (i.e., non-leaf node) of the binary tree, one syntax is used to indicate which splitting type (horizontal or vertical) is used, where 0 may indicate horizontal splitting and 1 may indicate vertical splitting.

FIG. 5 illustrates an example of block partitioning and its corresponding quad-tree plus binary tree structure (QTBT), where the solid lines indicate quadtree splitting and dotted lines indicate binary tree splitting.

FIG. 6 illustrates an example of CPR compensation, where area 610 corresponds to a picture, a slice or a picture area to be coded. Blocks 620 and 630 correspond to two blocks to be coded.

FIG. 7 illustrates an example of predictive block vector (BV) coding, where the BV difference (BVD) corresponding to the difference between a current BV and a BV predictor is signalled.

FIG. 8 illustrates examples of constrained reference pixel region for IntraBC mode (i.e., the current picture referencing, CPR mode).

FIG. 9 illustrates an example of ladder-shaped reference data area for WPP (wavefront parallel processing) associated with the CPR mode.

FIG. 10 illustrates an example of collocated colour plane candidate derivation from other colour planes in the same frame, where (Y1, U1, V1) and (Y2, U2, V2) are colour planes of two successive frames.

FIG. 11 illustrates a flowchart of an exemplary coding system with the affine motion compensation or adaptive motion vector (MV) resolution mode enabled according to an embodiment of the present invention, where if one reference picture index for the current block points to the current image, the affine motion compensation is inferred as Off for the current block without a need for signalling or parsing an affine mode syntax or the adaptive MV resolution is inferred as On for the current block without a need for signalling or parsing an adaptive motion vector resolution syntax.

FIG. 12 illustrates a flowchart of an exemplary coding system with the current picture referencing (CPR) mode enabled according to an embodiment of the present invention, where if the affine mode is used for the current block or if the adaptive MV resolution is not used for the current block, a reference picture index for the current block is signalled or parsed and the reference picture index always points to one reference picture other than the current image.

FIG. 13 illustrates a flowchart of an exemplary coding system with the current picture referencing (CPR) mode enabled according to an embodiment of the present invention, where if the affine mode is used for the current block, a number of motion vectors (MVs) for the current block for a reference picture list is signalled or parsed depending on whether the reference picture index for the current block points to the current image.

DETAILED DESCRIPTION OF THE INVENTION

The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.

In video coding based on the original quad-tree plus binary tree (QTBT) structure with luma/chroma separate coding, the luma and chroma are coded separately for all Intra frames (for example, I-slice). However, in HEVC-SCC, the CPR is designed for the three colour components jointly, and the MV of CPR is used for all three components. According to one aspect of the present invention, the CPR design is modified when the luma and chroma are coded separately. In the following, various methods of CPR coding with the luma/chroma separate CU structure are disclosed.

CPR with Affine Motion Compensation

When affine motion compensation is used for a block, more than one MV is used for the current PU. Therefore, more than one MVD needs to be signalled in affine AMVP mode. According to one embodiment of the present invention, the affine motion compensation mode is disabled when the CPR mode is used. In other words, when the CPR mode is used for the block, the block is coded using a coding mode selected from a coding group excluding the affine motion compensation mode. To achieve this, the reference picture index for encoding/decoding the current prediction unit (PU) is signalled/parsed first in the decoding order for either List 0 or List 1, before the affine mode syntax. Using List 0 as an example, if the reference picture index in List 0 for the current PU points to the current image itself, there will be only one set of MV information (e.g. MVP/MVP index and MVD) to be encoded/decoded in List 0. At the same time, there is no need to signal/parse the affine mode syntax (e.g. the affine mode flag) at the encoder/decoder side, and the affine mode flag is inferred as false (affine mode is disabled), where the affine mode syntax indicates whether affine AMVP mode is used for the current PU. If the reference picture index points to a reference picture other than the current image itself, the affine mode syntax needs to be encoded or decoded. If the affine mode is used, information for more than one MV may need to be encoded or decoded. Similar methods may apply to List 1 reference pictures as well.
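The parsing order described above can be illustrated with a minimal sketch. It is not taken from any codec implementation: the function name `parse_list_motion`, the parameter `cpr_ref_idx` (the index value that refers to the current image) and the use of a plain iterator of already entropy-decoded values are all assumptions made for illustration.

```python
from typing import Iterator, Tuple


def parse_list_motion(symbols: Iterator[int], cpr_ref_idx: int) -> Tuple[int, bool]:
    """Parse (ref_idx, affine_flag) for one reference picture list.

    `symbols` is a hypothetical stream of already entropy-decoded values;
    `cpr_ref_idx` is the (assumed) index that points to the current image.
    """
    # The reference picture index is parsed first in decoding order.
    ref_idx = next(symbols)
    if ref_idx == cpr_ref_idx:
        # CPR is used: the affine flag is inferred as false, nothing is read.
        affine_flag = False
    else:
        # A regular reference picture: the affine flag is explicitly parsed.
        affine_flag = bool(next(symbols))
    return ref_idx, affine_flag


if __name__ == "__main__":
    stream = iter([2, 1])
    print(parse_list_motion(stream, cpr_ref_idx=2))
    # The value 1 is still in the stream because no flag was consumed.
    print(next(stream))
```

The point of the ordering is that the decoder already knows whether CPR is in use by the time it reaches the affine flag, so the flag costs no bits for CPR-coded blocks.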

It is possible that in one reference picture list, CPR is used, while in another reference picture list, affine mode is used.

In another embodiment, the affine mode syntax is encoded or decoded before the reference picture index is encoded or decoded. For a current block, if the affine mode is applied (i.e., affine mode flag equal to true), the encoded or decoded reference picture index cannot point to the current image itself. In one example, the reference picture index codeword corresponding to the current picture is removed from the codeword table. If the affine mode is not used for the current block, a reference picture index for the current block is signalled or parsed, wherein if the reference picture list includes the current picture, a codeword of the reference picture index corresponding to the current image is included in a codeword table. In another example, bitstream conformance requires that the encoded or parsed reference picture index shall not be equal to the reference picture index pointing to the current image itself if the affine mode is applied for the current block.
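The codeword-table variant can be sketched as follows. This is only an illustration under stated assumptions: pictures are identified by picture order count (POC) values, the table is modelled as a list mapping codeword number to reference picture index, and the name `build_ref_idx_table` is hypothetical. The actual binarization of the codewords is not modelled.

```python
from typing import List


def build_ref_idx_table(ref_list_pocs: List[int], current_poc: int,
                        affine_flag: bool) -> List[int]:
    """Return the reference indices that have a codeword, in codeword order.

    When the affine flag is true, the entry that refers to the current
    picture is left out, so it cannot be signalled at all.
    """
    table = []
    for ref_idx, poc in enumerate(ref_list_pocs):
        refers_to_current = (poc == current_poc)
        if affine_flag and refers_to_current:
            continue  # codeword for the current picture is removed
        table.append(ref_idx)
    return table


if __name__ == "__main__":
    # Assumed list: two regular references (POC 8, 4) and the current picture (POC 16).
    print(build_ref_idx_table([8, 4, 16], current_poc=16, affine_flag=True))
    print(build_ref_idx_table([8, 4, 16], current_poc=16, affine_flag=False))
```

Removing the entry shortens the table for affine-coded blocks, whereas the bitstream-conformance alternative keeps the table unchanged and simply forbids the encoder from choosing that index.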

In another embodiment, information associated with a number of encoded or decoded MVs (e.g. MVP/MVP index and MVD) depends on the conditions of the affine mode and the CPR mode. If the affine mode is applied and the CPR mode is not applied for the block (e.g. the reference picture index pointing to a reference picture other than the current image), information associated with more than one MV is encoded or decoded in the current reference picture list. If the affine mode is applied and the CPR mode is applied (e.g. the reference picture index pointing to the current picture), information associated with only one MV is encoded or decoded in the current reference picture list. If the affine mode is not applied, information associated with only one MV is encoded or decoded in the current reference picture list.
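The three cases above reduce to a small decision function. The sketch below is illustrative only; the name `num_mvs_to_code` is hypothetical, and the default of two control-point MVs is an assumption (the text only states "more than one MV").

```python
def num_mvs_to_code(affine_applied: bool, ref_is_current_picture: bool,
                    num_control_points: int = 2) -> int:
    """Number of MV information sets coded for one reference picture list.

    `num_control_points` is an assumed value standing in for "more than one MV".
    """
    if affine_applied and not ref_is_current_picture:
        # Affine with a regular reference picture: several MVs are coded.
        return num_control_points
    # Affine with CPR, or no affine at all: a single MV is coded.
    return 1


if __name__ == "__main__":
    for affine in (True, False):
        for cpr in (True, False):
            print(affine, cpr, num_mvs_to_code(affine, cpr))
```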

CPR with Adaptive Motion Vector Resolution

If the CPR mode is used for coding a current PU and the assumption of adaptive MV resolution (integer MV resolution) in the CPR mode is true, there is no need for signalling the integer MV syntax for a CPR coded PU. The reference picture index in List 0 (or List 1) is decoded in the decoding order. If the reference picture is the current picture itself, it is allowed that the decoded MVD has a non-zero value. In this case, there is no need to signal the integer MV syntax (e.g. iMV flag). Only when there is a non-zero MVD value for a reference picture other than the current picture itself, the integer MV syntax needs to be signalled. Some of the examples for whether to signal the iMV flag are shown in Table 2, where exemplary combinations of reference picture selection and MVD value for iMV flag signalling decision are shown. List 0 and List 1 in Table 2 can be swapped.

TABLE 2
Ref. Picture List     Ref. picture / MVD   Ref. picture / MVD   Ref. picture / MVD   Ref. picture / MVD   Ref. picture / MVD
List 0                Current / —          Current / —          Current / —          Other / =0           Other / =0
List 1                Current / —          Other / =0           Other / !=0          Other / !=0          Other / =0
iMV_flag signalling   No                   No                   Yes                  Yes                  No
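The signalling decision illustrated by Table 2 can be expressed as a short predicate: the iMV flag is signalled only if at least one list refers to a picture other than the current picture and has a non-zero MVD. The sketch below is an illustration, not the codec's syntax; the name `imv_flag_is_signalled` and the tuple-based representation of each list are assumptions, and `None` stands for the "—" entries where the MVD value does not matter.

```python
from typing import Optional, Sequence, Tuple

MVD = Tuple[int, int]  # (horizontal, vertical) motion vector difference


def imv_flag_is_signalled(lists: Sequence[Tuple[bool, Optional[MVD]]]) -> bool:
    """Decide whether the iMV flag has to be signalled.

    Each entry is (refers_to_current_picture, mvd) for List 0 or List 1.
    """
    for refers_to_current, mvd in lists:
        if refers_to_current:
            continue  # CPR list: integer resolution is inferred, MVD is irrelevant
        if mvd is not None and mvd != (0, 0):
            return True  # non-zero MVD on a regular reference picture
    return False


if __name__ == "__main__":
    # The five example combinations of Table 2, as (List 0, List 1).
    columns = [
        [(True, None), (True, None)],
        [(True, None), (False, (0, 0))],
        [(True, None), (False, (3, -1))],
        [(False, (0, 0)), (False, (3, -1))],
        [(False, (0, 0)), (False, (0, 0))],
    ]
    for column in columns:
        print("Yes" if imv_flag_is_signalled(column) else "No")
```

The MVD values `(3, -1)` are arbitrary non-zero examples chosen for the demonstration.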

When CPR is applied, the MV is coded in integer MV resolution where the adaptive motion vector resolution is inferred as enabled. To achieve this, the reference picture index for encoding/decoding the current prediction unit (PU) is signalled or parsed first in the decoding order for either List 0 or List 1, before the adaptive motion vector resolution syntax. Using List 0 as an example, if the reference picture index in List 0 for the current PU points to the current image itself, there is no need to signal or parse the adaptive motion vector resolution syntax (e.g. adaptive motion vector resolution flag) at the encoder or decoder side respectively, and the adaptive motion vector resolution flag is inferred as true (i.e., adaptive motion vector resolution enabled). If the reference picture index points to a reference picture other than the current picture itself, the adaptive motion vector resolution syntax needs to be encoded or decoded. Similar methods apply to List 1 reference pictures as well.
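As a rough illustration of this inference for a single reference picture list, the following sketch returns the adaptive MV resolution flag without touching the bitstream when the reference picture is the current picture. The names `parse_amvr_flag` and `cpr_ref_idx`, and the iterator of decoded flags, are assumptions made for the example.

```python
from typing import Iterator


def parse_amvr_flag(ref_idx: int, cpr_ref_idx: int, flags: Iterator[bool]) -> bool:
    """Return the adaptive MV resolution flag for one reference picture list.

    `flags` is a hypothetical stream of decoded flag values; `cpr_ref_idx`
    is the (assumed) reference index that points to the current image.
    """
    if ref_idx == cpr_ref_idx:
        # CPR: integer MV resolution is inferred, no flag is read.
        return True
    # Regular reference picture: the flag is explicitly parsed.
    return next(flags)


if __name__ == "__main__":
    pending = iter([False])
    print(parse_amvr_flag(ref_idx=2, cpr_ref_idx=2, flags=pending))
    print(parse_amvr_flag(ref_idx=0, cpr_ref_idx=2, flags=pending))
```

Note the contrast with the affine case: for CPR-coded blocks the affine flag is inferred as false, while the adaptive MV resolution flag is inferred as true.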

In another embodiment, the adaptive motion vector resolution syntax is encoded or decoded before the reference picture index is encoded or decoded. For a current block, if the adaptive motion vector resolution is not applied (i.e., adaptive motion vector resolution flag equal to false), the encoded or decoded reference picture index cannot point to the current image itself. In one example, the reference picture index codeword corresponding to the current picture is removed from the codeword table. If the adaptive MV resolution is used for the current block, a reference picture index for the current block is signalled or parsed, wherein if the reference picture list includes the current picture, a codeword of the reference picture index corresponding to the current image is included in a codeword table. In another example, the bitstream conformance requires that the signalled or parsed reference picture index shall not be equal to the reference picture index pointing to the current image if the adaptive MV resolution is not applied for the current block.

The inventions disclosed above can be incorporated into various video encoding or decoding systems in various forms. For example, the inventions can be implemented using hardware-based approaches, such as dedicated integrated circuits (IC), field programmable gate array (FPGA), digital signal processor (DSP), central processing unit (CPU), etc. The inventions can also be implemented using software codes or firmware codes executable on a computer, laptop or mobile device such as smart phones. Furthermore, the software codes or firmware codes can be executable on a mixed-type platform such as a CPU with dedicated processors (e.g. video coding engine or co-processor).

FIG. 11 illustrates a flowchart of an exemplary coding system with the affine motion compensation or adaptive motion vector (MV) resolution mode enabled according to an embodiment of the present invention. The steps shown in the flowchart, as well as other following flowcharts in this disclosure, may be implemented as program codes executable on one or more processors (e.g., one or more CPUs) at the encoder side and/or the decoder side. The steps shown in the flowchart may also be implemented based on hardware such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to this method, input data associated with a current image are received in step 1110, where the affine motion compensation or adaptive motion vector (MV) resolution is enabled for coding the current image. At the encoder side, the input data may correspond to video data to be encoded. At the decoder side, the input data may correspond to compressed video data to be decoded. In step 1120, one or more reference picture indexes for a current block are signalled or parsed. As is understood, the one or more reference picture indexes for the current block are signalled at the encoder side or parsed at the decoder side. In step 1130, if one reference picture index for the current block points to the current image, the affine motion compensation is inferred as Off for the current block without a need for signalling or parsing an affine mode syntax or the adaptive MV resolution is inferred as On for the current block without a need for signalling or parsing an adaptive motion vector resolution syntax.

FIG. 12 illustrates a flowchart of an exemplary coding system with the current picture referencing (CPR) mode enabled according to an embodiment of the present invention. According to this method, input data associated with a current image is received in step 1210, wherein current picture referencing (CPR) mode is enabled, and wherein affine motion compensation or adaptive motion vector (MV) resolution is enabled for coding the current image. In step 1220, whether affine mode is used for the current block is determined when the affine motion compensation is enabled for the current image, or whether the adaptive MV resolution is used for the current block is determined when the adaptive MV resolution is enabled for the current image. In step 1230, if the affine mode is used for the current block or if the adaptive MV resolution is not used for the current block, a reference picture index for the current block is signalled or parsed, wherein the reference picture index always points to one reference picture other than the current image. As is understood, the reference picture index for the current block is signalled at the encoder side or parsed at the decoder side.

FIG. 13 illustrates a flowchart of an exemplary coding system with the current picture referencing (CPR) mode enabled according to an embodiment of the present invention. According to this embodiment, input data associated with a current image is received in step 1310, wherein current picture referencing (CPR) mode and affine motion compensation are enabled for coding the current image. Affine mode and reference picture index are determined for the current block in step 1320. As is known in the field, the encoder may determine whether to use the affine mode for the current block based on a certain performance criterion such as rate-distortion optimization (RDO). At the decoder side, whether the affine mode is used for the block can be determined from coded information, such as a syntax element in the video bitstream. In step 1330, if the affine mode is used for the current block, a number of motion vectors (MVs) for the current block for a reference picture list is signalled or parsed depending on whether the reference picture index for the current block points to the current image.

The flowcharts shown are intended to illustrate an example of video coding according to the present invention. A person skilled in the art may modify each step, re-arrange the steps, split a step, or combine steps to practice the present invention without departing from the spirit of the present invention. In the disclosure, specific syntax and semantics have been used to illustrate examples to implement embodiments of the present invention. A skilled person may practice the present invention by substituting the syntax and semantics with equivalent syntax and semantics without departing from the spirit of the present invention.

The above description is presented to enable a person of ordinary skill in the art to practice the present invention as provided in the context of a particular application and its requirement. Various modifications to the described embodiments will be apparent to those with skill in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed. In the above detailed description, various specific details are illustrated in order to provide a thorough understanding of the present invention. Nevertheless, it will be understood by those skilled in the art that the present invention may be practiced.

Embodiments of the present invention as described above may be implemented in various hardware, software codes, or a combination of both. For example, an embodiment of the present invention can be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processing described herein. An embodiment of the present invention may also be program code to be executed on a Digital Signal Processor (DSP) to perform the processing described herein. The invention may also involve a number of functions to be performed by a computer processor, a digital signal processor, a microprocessor, or field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention, by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention. The software code or firmware code may be developed in different programming languages and different formats or styles. The software code may also be compiled for different target platforms. However, different code formats, styles and languages of software codes and other means of configuring code to perform the tasks in accordance with the invention will not depart from the spirit and scope of the invention.

The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

1. A method of video encoding and decoding used by a video encoding system and video decoding system respectively, the method comprising: receiving input data associated with a current block in a current image, wherein affine motion compensation or adaptive motion vector (MV) resolution is enabled for coding the current image; signalling or parsing one or more reference picture indexes for a current block; and if one reference picture index for the current block points to the current image, inferring the affine motion compensation as Off for the current block without a need for signalling or parsing an affine mode syntax or inferring the adaptive MV resolution as On for the current block without a need for signalling or parsing an adaptive motion vector resolution syntax.

2. The method of claim 1, wherein when the affine motion compensation is enabled for coding the current image and if the reference picture index for the current block points to one reference picture other than the current image, the affine mode syntax is signalled or parsed to determine whether the affine motion compensation is applied to the current block.

3. The method of claim 1, wherein when the adaptive MV resolution is enabled for coding the current image and if the reference picture index for the current block points to one reference picture other than the current image, the adaptive MV resolution syntax is signalled or parsed to determine whether the adaptive MV resolution is applied to the current block.

4. The method of claim 1, wherein if a target reference picture index for a target reference picture list points to one reference picture other than the current image and an MVD (motion vector difference) value associated with the target reference picture list is not equal to zero, the adaptive MV resolution syntax is signalled or parsed, and wherein the target reference picture list corresponds to List 0 or List 1.

5. The method of claim 1, wherein if a reference picture index points to the current image or an MVD (motion vector difference) value associated with one reference picture list is equal to zero for one or both reference picture lists, the adaptive MV resolution syntax is not signalled or parsed.

6. The method of claim 1, wherein the reference picture index for the current block is signalled or parsed before the affine mode syntax or an adaptive MV resolution mode syntax is signalled or parsed.

7. An apparatus of video encoding and decoding used by a video encoding system and video decoding system respectively, the apparatus comprising one or more electronic circuits or processors arranged to: receive input data associated with a current block in a current image, wherein affine motion compensation or adaptive motion vector (MV) resolution is enabled for coding the current image; signal or parse one or more reference picture indexes for a current block; and if one reference picture index for the current block points to the current image, infer the affine motion compensation as Off for the current block without a need for signalling or parsing an affine mode syntax or infer the adaptive MV resolution as On for the current block without a need for signalling or parsing an adaptive motion vector resolution syntax.

8-20. (canceled)